Personal Email Networks: An Effective Anti-Spam Tool

Authors

  • P. Oscar Boykin
  • Vwani P. Roychowdhury

Abstract

We provide an automated graph-theoretic method for identifying individual users’ trusted networks of friends in cyberspace. We routinely use our social networks to judge the trustworthiness of outsiders, e.g., to decide where to buy our next car or to find a good mechanic for it. In this work, we show that an email user may similarly use his email network, constructed solely from sender and recipient information available in the email headers, to distinguish between unsolicited commercial emails, commonly called “spam”, and emails associated with his circles of friends. We exploit the properties of social networks to construct an automated anti-spam tool that processes an individual user’s personal email network to simultaneously identify the user’s core trusted networks of friends and the subnetworks generated by spam. In our empirical studies of individual mailboxes, our algorithm classified approximately 53% of all emails as spam or non-spam, with 100% accuracy. The remaining emails are left unclassified by this network analysis tool. However, the tool has two useful features. First, it requires no user intervention or supervised training; second, it produces no misclassifications, i.e., no spam is labeled as non-spam and no non-spam is labeled as spam. These two features suggest that our algorithm may be used as a platform for a comprehensive solution to the spam problem when used in concert with more sophisticated, but more cumbersome, content-based filters.
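The idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not state the edge rule or the classification criterion, so the sketch assumes that an edge links a sender to each recipient (the mailbox owner is left out), and that connected components are labeled by their mean clustering coefficient: star-like components with almost no triangles are treated as spam, and components rich in triangles as friends. The function names and the `min_size`, `low`, and `high` thresholds are hypothetical.

```python
from itertools import combinations


def build_network(messages, owner):
    """Build an undirected graph from (sender, recipients) header pairs.

    Assumption of this sketch: an edge joins the sender to each recipient,
    and the mailbox owner is excluded from the graph.
    """
    graph = {}
    for sender, recipients in messages:
        others = [r for r in recipients if r != owner and r != sender]
        for r in others:
            graph.setdefault(r, set())
        if sender == owner:
            continue
        graph.setdefault(sender, set())
        for r in others:
            graph[sender].add(r)
            graph[r].add(sender)
    return graph


def components(graph):
    """Return the connected components as a list of node sets."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, comp = [start], set()
        while stack:
            node = stack.pop()
            if node in comp:
                continue
            comp.add(node)
            stack.extend(graph[node] - comp)
        seen |= comp
        result.append(comp)
    return result


def clustering(graph, comp):
    """Mean local clustering coefficient over nodes with degree >= 2."""
    scores = []
    for node in comp:
        nbrs = graph[node]
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for a, b in combinations(nbrs, 2) if b in graph[a])
        scores.append(2 * links / (k * (k - 1)))
    return sum(scores) / len(scores) if scores else 0.0


def classify(graph, min_size=4, low=0.01, high=0.1):
    """Label every address; the thresholds are hypothetical placeholders."""
    labels = {}
    for comp in components(graph):
        c = clustering(graph, comp)
        if len(comp) < min_size:
            label = "unclassified"  # too small to judge by structure
        elif c < low:
            label = "spam"          # star-like, no triangles
        elif c > high:
            label = "non-spam"      # friends who also mail each other
        else:
            label = "unclassified"
        for node in comp:
            labels[node] = label
    return labels
```

Calling `classify(build_network(messages, owner))` returns a dictionary from address to label. Addresses in small or ambiguous components stay unclassified, which mirrors the abstract's point that such messages would be handed to a content-based filter.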


Related resources

Variable Thresholding In Naïve Bayesian Spam Filters

Email has become an essential means of communication for both business and personal use. However, the proliferation of unwanted email advertising or spam has cost organizations millions of dollars and has reduced the effectiveness of email as a communications medium. Recently, spam filters have been widely adopted as a means of combating these unwanted messages. This paper presents a method for...


A Suitable Method for Classifying Advertising Emails Based on User Profiles

In general, whether an email is spam depends on whether it satisfies the client, not on the content of the email. Under this definition, problems arise in marketing and advertising: for example, some advertising emails may be spam for some users and not for others. To deal with this problem, many researchers design an anti-s...


Scalable and Reliable Collaborative Spam Filters: Harnessing the Global Social Email Networks

We introduce a collaborative anti-spam system that is based on pervasive global social email networks. Essentially, we provide a solution to this open research problem: given a network of N users who are willing to share information collaboratively (e.g. the digests or fingerprints of known spams), how do we search for each user's content efficiently and reliably in a distributed manner with minima...


Email Spam Filtering: A Systematic Review

Spam is information crafted to be delivered to a large number of recipients, in spite of their wishes. A spam filter is an automated tool to recognize spam so as to prevent its delivery. The purposes of spam and spam filters are diametrically opposed: spam is effective if it evades filters, while a filter is effective if it recognizes spam. The circular nature of these definitions, along with t...


Spam Over Internet Telephony and How to Deal with it

In our modern society, telephony has developed into an omnipresent service. People are reachable anytime and anywhere. Furthermore, the Internet has emerged as an important communication medium. These facts and the rising availability of broadband Internet access have led to the fusion of these two services. Voice over IP, or VoIP for short, is the keyword that describes this combination. The advanta...



Journal title:
  • CoRR

Volume cond-mat/0402143

Publication date 2004